Real-time audio-to-score alignment of singing voice based on melody and lyric information
Abstract
The singing voice is distinctive in music: a vocal performance conveys both musical (melody/pitch) and lyric (text/phoneme) content. This paper aims to exploit both melody and lyric information for real-time audio-to-score alignment of the singing voice. First, lyrics are added as a separate observation stream to a template-based hidden semi-Markov model (HSMM), whose observation model is built from vowel templates. Second, early and late fusion of melody and lyric information are applied during real-time audio-to-score alignment. An experiment conducted with two professional singers (one male, one female) shows that the performance of a lyrics-based system is comparable to that of melody-based score-following systems. Furthermore, late fusion of melody and lyric information substantially improves alignment performance. Finally, maximum a posteriori (MAP) adaptation of the vowel templates from one singer to the other suggests that lyric information can be used efficiently for any singer.
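To make the distinction between early and late fusion concrete, here is a minimal sketch. It is not the paper's implementation. It assumes a plain HMM forward filter instead of the HSMM, conditionally independent streams for early fusion, and a weighted average of posteriors for late fusion. All function names, the `weight` parameter, and the toy numbers are hypothetical.

```python
def forward_step(prior, transition, obs_lik):
    """One step of HMM forward filtering; returns a normalized posterior."""
    n = len(prior)
    # Predict: propagate the previous posterior through the transition matrix.
    predicted = [sum(prior[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Update: weight each state by its observation likelihood, then normalize.
    unnorm = [predicted[j] * obs_lik[j] for j in range(n)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def early_fusion_step(prior, transition, melody_lik, lyric_lik):
    """Early fusion (assumed form): merge the two observation streams first,
    then run a single filter on the joint likelihood."""
    joint = [m * l for m, l in zip(melody_lik, lyric_lik)]
    return forward_step(prior, transition, joint)


def late_fusion_step(prior, transition, melody_lik, lyric_lik, weight=0.5):
    """Late fusion (assumed form): run one filter per stream, then average
    the two posteriors over score positions."""
    post_melody = forward_step(prior, transition, melody_lik)
    post_lyric = forward_step(prior, transition, lyric_lik)
    return [weight * a + (1.0 - weight) * b for a, b in zip(post_melody, post_lyric)]


# Toy left-to-right score with three positions (made-up numbers).
prior = [1.0, 0.0, 0.0]
transition = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
melody_lik = [0.2, 0.7, 0.1]  # hypothetical pitch-template likelihoods
lyric_lik = [0.4, 0.5, 0.1]   # hypothetical vowel-template likelihoods

early = early_fusion_step(prior, transition, melody_lik, lyric_lik)
late = late_fusion_step(prior, transition, melody_lik, lyric_lik)
print("early fusion posterior:", early)
print("late fusion posterior:", late)
```

In this sketch the only difference is where the streams meet: before inference (one filter on a joint likelihood) or after it (two filters, one combined estimate). The HSMM in the paper additionally models how long each score event lasts, which this sketch leaves out.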
Similar resources
Music Information Retrieval from a Singing Voice Based on Verification of Recognized Hypotheses
Several music information retrieval (MIR) systems have been developed which retrieve musical pieces by the user’s singing voice. All of these systems use only melody information for retrieval, although lyrics information is also useful for retrieval. In this paper, we propose an MIR system that uses both melody and lyrics information in the singing voice. The MIR system verifies hypotheses outp...
An automatic singing transcription system with multilingual singing lyric recognizer and robust melody tracker
A singing transcription system which transcribes human singing voice to musical notes is described in this paper. The fact that human singing rarely follows a standard musical scale makes it a challenge to implement such a system. This system utilizes some new methods to deal with the issue of imprecise musical scale of input voice of a human singer, such as spectral standard deviation used for n...
Automatic Alignment of Music Audio and Lyrics
This paper proposes an algorithm for aligning singing in polyphonic music audio with textual lyrics. As preprocessing, the system uses a voice separation algorithm based on melody transcription and sinusoidal modeling. The alignment is based on a hidden Markov model speech recognizer where the acoustic model is adapted to singing voice. The textual input is preprocessed to create a language mod...
Analyzing the influence of pitch quantization and note segmentation on singing voice alignment in the context of audio-based Query-by-Humming
Query-by-Humming (QBH) systems base their operation on aligning the melody sung/hummed by a user with a set of candidate melodies retrieved from polyphonic songs. While MIDI-based QBH builds on the premise of existing annotated transcriptions for any candidate song, audio-based research makes use of melody estimation algorithms for the songs. In both cases, a melody abstraction process is requir...
Automatic scoring of singing voice based on melodic similarity measures
A method for automatic assessment of singing voice is proposed. The method quantifies in a meaningful way the similarity between the user performance and a reference melody. A set of melodic similarity measures comprising intonation and rhythmic aspects has been implemented for this goal. These measures implement different MIR techniques, such as melodic transcription or score alignment. The re...